Consistency and Variation in Kernel Neural Ranking Model
This paper studies the consistency of the kernel-based neural ranking model
K-NRM, a recent state-of-the-art neural IR model. Consistency is important for
reproducible research and deployment in industry. We find that K-NRM has
low variance on relevance-based metrics across experimental trials. In spite of
this low variance in overall performance, different trials produce different
document rankings for individual queries. In our experiments, the main source
of this variance is the different latent matching patterns captured by
K-NRM. In the IR-customized word embeddings learned by K-NRM, the
query-document word pairs follow two different matching patterns that are
equally effective, but align word pairs differently in the embedding space. The
different latent matching patterns enable a simple yet effective approach to
constructing ensemble rankers, which improve K-NRM's effectiveness and
generalization abilities.
Comment: 4 pages, 4 figures, 2 tables
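The abstract does not say how the ensemble rankers are built. As a minimal sketch, assuming the simplest option of averaging each document's score across independently trained trials, the idea might look like the following. The function name `ensemble_rank` and all scores are hypothetical and made up for illustration; they are not results from the paper.

```python
from statistics import mean

def ensemble_rank(trial_scores):
    """Rank documents by their mean score across independently trained trials.

    trial_scores: list of dicts, one per trial, mapping doc_id -> score.
    Assumes every trial scored the same candidate documents for one query.
    """
    doc_ids = trial_scores[0].keys()
    avg = {d: mean(t[d] for t in trial_scores) for d in doc_ids}
    # Sort by descending average score; break ties by doc_id for determinism.
    return sorted(avg, key=lambda d: (-avg[d], d))

# Hypothetical scores for a single query from three trials (made-up numbers).
trials = [
    {"d1": 0.9, "d2": 0.4, "d3": 0.5},
    {"d1": 0.3, "d2": 0.8, "d3": 0.6},
    {"d1": 0.7, "d2": 0.6, "d3": 0.1},
]
ranking = ensemble_rank(trials)
```

Each trial orders the three documents differently, yet the averaged scores give one combined ranking. Plain averaging assumes the trials produce scores on comparable scales; if they do not, rank-based fusion may be a safer choice.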
Selective Weak Supervision for Neural Information Retrieval
This paper democratizes neural information retrieval to scenarios where
large-scale relevance training signals are not available. We revisit the classic IR
intuition that anchor-document relations approximate query-document relevance
and propose a reinforcement weak supervision selection method, ReInfoSelect,
which learns to select anchor-document pairs that best weakly supervise the
neural ranker (action), using the ranking performance on a handful of relevance
labels as the reward. Iteratively, for a batch of anchor-document pairs,
ReInfoSelect backpropagates the gradients through the neural ranker, gathers
its NDCG reward, and optimizes the data selection network using policy
gradients, until the neural ranker's performance peaks on target relevance
metrics (convergence). In our experiments on three TREC benchmarks, neural
rankers trained by ReInfoSelect, with only publicly available anchor data,
significantly outperform feature-based learning-to-rank methods and match the
effectiveness of neural rankers trained with private commercial search logs.
Our analyses show that ReInfoSelect effectively selects weak supervision
signals based on the stage of the neural ranker training, and intuitively picks
anchor-document pairs similar to query-document pairs.
Comment: Accepted by WWW 2020
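The select, train, reward, update loop described in this abstract can be illustrated with a toy policy-gradient sketch. This is not the authors' implementation. It assumes a one-feature logistic selection policy in place of the data selection network, and it replaces the ranker's NDCG reward with a made-up stand-in (`toy_reward`). All function names, features, and hyperparameters are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def select_batch(w, b, features, rng):
    """Sample keep/drop actions from a Bernoulli policy p = sigmoid(w*x + b)."""
    probs = [sigmoid(w * x + b) for x in features]
    actions = [1 if rng.random() < p else 0 for p in probs]
    return probs, actions

def policy_gradient_step(w, b, features, probs, actions, reward, baseline, lr):
    """One REINFORCE update; for a Bernoulli policy, grad log pi = (a - p) * x."""
    advantage = reward - baseline
    grad_w = sum((a - p) * x for a, p, x in zip(actions, probs, features))
    grad_b = sum(a - p for a, p in zip(actions, probs))
    return w + lr * advantage * grad_w, b + lr * advantage * grad_b

def toy_reward(features, actions):
    """Stand-in for the ranker's NDCG on a few labeled queries (NOT a real
    metric): the fraction of selected pairs whose feature is positive."""
    chosen = [x for x, a in zip(features, actions) if a]
    if not chosen:
        return 0.0
    return sum(1 for x in chosen if x > 0) / len(chosen)

rng = random.Random(0)
w, b, baseline = 0.0, 0.0, 0.0
for step in range(100):
    # Hypothetical one-dimensional features for a batch of anchor-document pairs.
    features = [rng.uniform(-1, 1) for _ in range(16)]
    probs, actions = select_batch(w, b, features, rng)
    # In the method described above, the neural ranker would be trained on the
    # selected pairs here, and the reward would be its NDCG on the labeled set.
    reward = toy_reward(features, actions)
    w, b = policy_gradient_step(w, b, features, probs, actions,
                                reward, baseline, lr=0.05)
    baseline = 0.9 * baseline + 0.1 * reward  # moving-average baseline
```

The moving-average baseline is a common variance-reduction choice for REINFORCE and is an assumption of this sketch, not something the abstract states.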